Abstract-

In recent years, image mining techniques have come to
play a vital role in various fields. With the rapid
advance of information technology, various methods have
appeared to process and store image data, along with
issues of data retrieval and huge data volume.
Image retrieval has developed into a very active
research area that focuses on how to extract and retrieve
images. A variety of methods have been proposed
for image retrieval, and each technique has advantages
and drawbacks. Procedural difficulty and other
problems affect the performance of existing systems and
make them inadequate. In this paper, images are retrieved
using texture features such as contrast, energy and
homogeneity, which, together with a threshold value, are
calculated separately and stored in a feature database. A
feature vector is generated for each image, and matching is
done with the Euclidean distance, which measures the
distance between two images. The experimental results show
that the CSLBP method provides a better retrieval rate than
the existing methods in terms of retrieval rate, precision
and recall.

Keywords-LBP, ILBP, MBLBP, CSLBP, Euclidean, 
Precision, Recall. 


1. INTRODUCTION 

Today, information has great value, and the amount of
information has grown enormously during the last few years.
In particular, text databases are growing rapidly due to the
increasing amount of information available in electronic form.

Generally, data mining is the process of analyzing data
from different perspectives and summarizing it into useful
information. Such information can be used to increase
revenue, cut costs or both. Data mining allows users to analyze
data from many different dimensions or angles, categorize it and
summarize the relationships identified.




Fig 1.1 Steps in Data mining 


Data Mining Techniques 

Several core techniques are used in data mining; they
describe the type of mining and data recovery operation. The most
common techniques used in the field of data mining are as follows.

Association 

Association is a data mining function that discovers the 
probability of the co-occurrence of items in a collection. The 
relationships between co-occurring items are expressed as association 
rules. Association (or relation) is probably the best known, most
familiar and most straightforward data mining technique. It
makes a simple correlation between two or more items, often of the
same type, to identify patterns.

Classification 

Classification is a data mining function that assigns items 
in a collection to target categories or classes. The goal of 
classification is to accurately predict the target class for each case in 
the data. For example, a classification model could be used to 
identify loan applicants as low, medium or high credit risks. 

Clustering 

Clustering is a data mining (machine learning) technique
used to place data elements into related groups without advance
knowledge of the group definitions. Popular clustering techniques
include k-means clustering and expectation maximization (EM)
clustering. Support Vector Machines (SVMs), in contrast, are
supervised models that analyze data for classification and
regression analysis. In addition to performing linear
classification, SVMs can efficiently perform non-linear
classification using what is called the kernel trick, implicitly
mapping their inputs into high-dimensional feature spaces. When
data are not labeled, supervised learning is not possible and an
unsupervised learning approach is required, which attempts to find
a natural clustering of the data into groups and then maps new data
to these groups. Clustering is useful for identifying different
information because it correlates each item with other examples to
see where the similarities and ranges agree.

Artificial Neural Networks 

Artificial Neural Networks are relatively crude electronic
models based on the neural structure of the brain. The brain basically
learns from experience. It is natural proof that some problems that
are beyond the scope of current computers are indeed solvable by
small, energy-efficient packages. This brain modeling also promises a
less technical way to develop machine solutions. These non-linear
predictive models learn through training and resemble biological
neural networks in structure.

1.2 Image Retrieval 

An image retrieval system is a computer system for 
browsing, searching and retrieving images from a large 
database of digital images. Most traditional and common 
methods of image retrieval utilize some method of adding 
metadata such as captioning, keywords or descriptions to the 
images so that retrieval can be performed over the annotation 
words. Manual image annotation is time-consuming, laborious
and expensive; to address this, a large amount of research
has been done on automatic image annotation. Additionally, the
increase in social web applications and the semantic web has
inspired the development of several web-based image
annotation tools.

Image retrieval is an important topic in the field of 
pattern recognition and artificial intelligence. Generally 
speaking, there are three categories of image retrieval methods: 

i. Text-based 

ii. Content-based 

iii. Semantic-based 

In the text-based approach, the images need to be
manually annotated with text descriptors, which requires much
human labor, and the annotation accuracy is
subject to human perception. Image retrieval is an extension of
traditional information retrieval. Approaches to image retrieval
are derived from conventional information retrieval and are
designed to manage the more versatile and enormous amount of
visual data that exists.

Low-level visual features such as color, texture, shape 
and spatial relationships are directly related to perceptual 
aspects of image content. Since it is usually easy to extract and 
represent these features and fairly convenient to design 
similarity measures by using the statistical properties of these 
features, a variety of content-based image retrieval techniques 
have been proposed in the past few years. 

Image retrieval systems attempt to search through a 
database to find images that are perceptually similar to a query 
image. Content-based image retrieval (CBIR) is an important
alternative and complement to traditional text-based image
searching and can greatly enhance the accuracy of the
information being retrieved. It aims to
develop an efficient visual-content-based technique to search, 
browse and retrieve relevant images from large-scale digital 
image collections. 

1.3 Content Based Image Retrieval (CBIR) 

The term Content-Based Image Retrieval (CBIR) 
seems to have originated in 1992, when it was used by T. Kato 
to describe experiments into automatic retrieval of images from 
a database, based on the colors and shapes present. Since then, 
the term has been used to describe the process of retrieving 
desired images from a large collection on the basis of 
syntactical image features. 

Content-based image retrieval has become a prominent
research topic because of the proliferation of video and image
data in digital form. The main goal of CBIR is efficiency
during image indexing and retrieval, thereby reducing the need
for human intervention in the indexing process. The computer
must be able to retrieve images from a database without any
human assumptions about a specific domain. The fundamental
operations applied to image databases are matching and
determining whether the data is present or not. Matching alone
is not expressive enough for multimedia data and database
systems.

Various content-based image retrieval (CBIR) systems
have been introduced that operate in two phases: indexing and
searching. In the indexing phase, each image of the database
is represented using a set of image attributes, such as
texture and layout. The extracted features are
stored in a visual feature database. In the searching phase, when 
a user makes a query, a feature vector for the query is 
computed. Using a similarity criterion, this vector is compared 
to the vectors in the feature database. The image most similar to 
the query (or images for range query) is returned to the user. 
Visual feature extraction is the basis of any content-based 
image retrieval technique. Widely used features include color, 
texture, shape and spatial relationships. 

Texture based retrieval 

In general, texture-based image matching is carried
out using the similarity between areas of the images with
similar texture. One technique for measuring texture
similarity is to calculate the relative brightness of
selected pairs of pixels from each image. From these it is
possible to compute measures of image texture such as the
degree of contrast, coarseness, directionality, regularity or
periodicity and randomness. Texture queries can be formulated
in a similar manner to color image queries, by selecting
examples of desired textures from a palette or by supplying a
query image. The system then retrieves images whose texture
measures are close to those of the query image.

Edge based retrieval 

The edges in an image are usually regarded as abrupt
changes in some physical property, such as geometry,
illumination or reflectivity. Mathematically, an edge
corresponds to a discontinuity in the function representing
the physical property. Various methods have been proposed to
extract the specific features of edges. Once the edge map has
been obtained from the query image, the edge features are
extracted and stored in the feature database for image
retrieval. To improve the efficiency of an image retrieval
system based on low-level features, edge features are
extracted and included, since salient features are embedded
in the edges.

Shape based retrieval 

The ability to retrieve images based on shape is 
perhaps the most obvious requirement at the primitive level. 
Unlike texture, shape is a fairly well defined concept and there 
is considerable evidence that natural objects are primarily 
recognized by their shape. Queries are answered by
computing the same set of features for the query image and
retrieving those stored images whose features most closely
match those of the query.

Color based retrieval

Several methods for retrieving images on the basis of 
color have been described, but most of the methods use the 
same basic principle. Each image added to the collection is
analyzed to compute a color histogram, which shows the
proportion of pixels of each color within the image. The color
histogram for each image is then stored in the database. The
matching process retrieves images whose color histograms are
similar to that of the query image.
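
As an illustration of the histogram matching described above, a
minimal Python sketch follows. The function names, the 8-bin
quantization per channel and the use of histogram intersection as
the similarity measure are assumptions made for this example; the
text does not prescribe a particular binning or measure.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Normalized per-channel histogram of an H x W x 3 uint8 image."""
    hists = []
    for channel in range(image.shape[2]):
        counts, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        hists.append(counts)
    hist = np.concatenate(hists).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

# Toy database: an exact copy of the query and a darkened version of it.
rng = np.random.default_rng(0)
query = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
database = {"same": query.copy(), "dark": (query // 4).astype(np.uint8)}

# Indexing phase: store one histogram per image.
index = {name: color_histogram(img) for name, img in database.items()}

# Searching phase: rank stored images by similarity to the query histogram.
q = color_histogram(query)
ranked = sorted(index, key=lambda name: histogram_intersection(q, index[name]),
                reverse=True)
print(ranked)
```

The exact copy is ranked first because its histogram is identical
to that of the query.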

Semantic based retrieval 

Semantic based retrieval is a high-level form of image
retrieval. In the semantic based retrieval technique, semantic
meanings are used to retrieve relevant images. Typically,
some form of knowledge base is required in semantic
based retrieval systems. The ideal CBIR system from a user
perspective would involve what is referred to as semantic
retrieval, where the user makes a request such as "find
pictures of dogs". This type of open-ended task is very
difficult for computers to complete. Semantic analysis is also
considered in biometric systems to recognize objects.

2. RELATED WORKS 

Ahonen et al., [1] proposed an efficient image 
representation based on local binary pattern texture features. 
The image is divided into several regions from which the 
LBP feature distributions are extracted and concatenated into 
an enhanced feature vector to be used as descriptor. 

Felicitas et al. [2] proposed a fuzzy index for edge
evaluation that does not require a binarization step. In order
to process all detected edges, images are represented in their
fuzzy form and all calculations are made with fuzzy set
operators between the images to be compared. These metrics
give better results on synthetic images and have not been
used for real images.

Content Based Image Retrieval (CBIR) is a
technique used for extracting relevant images from the image
database based on the input query image. The most
challenging aspect of CBIR is to bridge the gap between
low-level features and high-level features. Among the early
works, Query-By-Image-Content (QBIC) was the first CBIR
system [3].

Heikkilä et al. [4] discussed an efficient texture-based
method for modeling the background and detecting moving
objects in a video sequence. Each pixel is modeled as a
group of adaptive local binary pattern histograms that are
calculated over a circular region around the pixel. The
approach provides many advantages compared to other
methods, and the experimental results clearly justify the
model.

Huang et al., [5] developed a method based on the
accurate localization of representative points, which is
crucial to many analysis and synthesis problems. The active
shape model is a powerful statistical tool for alignment.
However, it suffers from variations of pose, illumination and
expression. An analysis of the mechanism of the active shape
model shows that the ability of normal profiles to describe
the local appearance pattern is very limited. For efficient
appearance pattern representation, the local binary pattern
is used and extended to describe the local patterns of key
points.


Masily [6] developed a method that is very
similar to LBP. The only difference is that the neighboring
pixels lie on an ellipse around the central pixel rather than
on a circle.

Ojala et al., [7] used three standard approaches to
automatic texture classification, which make use of features
based on the Fourier power spectrum, first-order statistics of
gray level differences and second-order gray level statistics.
Feature sets of these types, all designed analogously,
were used to classify two sets of terrain samples. It was found
that the Fourier features generally performed more poorly,
while the other feature sets all performed comparably well.

The photo book system is a set of interactive tools
for browsing and searching images [8]. It consists of three
sub-books, the appearance photo book, the shape photo
book and the texture photo book, which extract appearance,
shape and texture features, respectively. Users can query for
an image based on the corresponding features in each of the
three sub-books or on a combination of different mechanisms
with a text-based description.

Pooja et al. [9] used the Canny and Sobel edge
detection algorithms for extracting shape features from the
images. After the shape features are extracted, the classified
images are indexed and labeled for retrieval of the images
from the smaller image database.

Rong et al. [10] describe that one way of bridging the
semantic gap between the low-level features and the high-level
semantics lies within the interface between the user and the
system; another research direction is towards improving aspects
of CBIR systems by finding the latent correlation between
low-level visual features and high-level semantics and
integrating them into a unified vector space model.

Rui et al., [11] presented a comprehensive survey of
the technical achievements in the research area of image
retrieval, especially content-based image retrieval, an area
that has been very active and prosperous in the past few years.
The survey covers the research aspects of the three
fundamental bases of content-based image retrieval, namely
image feature representation and extraction, multidimensional
indexing and system design. Furthermore, based on the
state-of-the-art technology available now and the demand from
real-world applications, open research issues are identified
and promising future research directions are suggested.

Smeulders et al., [12] presented a review of 200 
references in content-based image retrieval and the working 
conditions of content-based retrieval: patterns of use, types of 
pictures, the role of semantics and the sensory gap. This 
review focuses on image processing for retrieval sorted by 
color, texture and local geometry. Features for retrieval are 
discussed next, sorted by: accumulative and global features, 
salient points, object and shape features, signs and structural 
combinations thereof. Similarity of pictures and objects in 
pictures is reviewed for each of the feature types, in close 
connection to the types and means of feedback the user of the 
systems is capable of giving by interaction. 

Smith et al., [13] noted that digital image and video
libraries require new algorithms for the automated extraction
and indexing of salient image features. Texture features
provide one important cue for the visual perception and
discrimination of image content. They proposed an approach for
automated content extraction that allows efficient database
searching using texture features. The algorithm automatically
extracts texture regions from image spatial-frequency data,
which are represented by binary texture feature vectors.

Zhao et al., [14] extended the LBP to the completed 
modeling of local binary patterns (CLBP), which is composed 
of the center gray level, sign components and magnitude 
components. The authors concluded that the CLBP has better 
texture feature extraction capabilities than the standard LBP. 

3. Existing Methodology 

Image retrieval includes several techniques such
as filtering, feature extraction and image classification.

3.1 Local Binary Pattern (LBP) 

A local binary pattern (LBP) is a type of feature
used for classification in computer vision. LBP [8] was first
described in 1994. It has since been found to be a powerful
feature for texture classification, based on the assumption that
texture locally has two complementary aspects: a pattern
and its strength. The basic version of the local binary pattern
operator works on a 3 x 3 pixel block of an image. The pixels
in this block are thresholded by the center pixel value,
multiplied by powers of two and then summed to obtain a label
for the center pixel.
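
The basic 3 x 3 operator can be sketched in Python as follows. This
is a minimal illustration, not a reference implementation: the
clockwise bit order starting at the top-left neighbour and the use of
">=" in the comparison are assumptions, since the text does not fix
either convention.

```python
import numpy as np

# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (assumed order; other orders only permute the bits of the label).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(block):
    """LBP label of the centre pixel of a 3 x 3 block."""
    center = block[1, 1]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # Threshold each neighbour by the centre, weight by a power of two.
        if block[1 + dr, 1 + dc] >= center:
            code += 2 ** bit
    return code

def lbp_image(gray):
    """LBP label for every interior pixel of a 2-D grayscale array."""
    rows, cols = gray.shape
    out = np.zeros((rows - 2, cols - 2), dtype=np.uint8)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r - 1, c - 1] = lbp_code(gray[r - 1:r + 2, c - 1:c + 2])
    return out

block = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_code(block))  # 1 + 16 + 32 + 64 + 128 = 241
```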

3.2 Improved Local Binary Pattern (ILBP) 

Jin et al. [17] pointed out that LBP could miss the
local structure information under some circumstances. For
instance, the LBP operator can only produce 256 of all 511
patterns for a 3x3 neighborhood, as the central pixel is not
considered. In order to obtain the complete information, they
proposed an Improved LBP (ILBP), which compares all the pixels
(including the central pixel) with the mean of all the pixels
in the kernel. Later, ILBP was extended to neighborhoods of any
size instead of the original 3x3 [16].
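
A minimal Python sketch of the ILBP comparison against the kernel
mean is given below. The row-major bit order and the strict ">"
comparison are assumptions for illustration; with a strict
comparison the all-ones pattern cannot occur, which is consistent
with the count of 511 patterns mentioned above.

```python
import numpy as np

def ilbp_code(block):
    """ILBP label of a 3 x 3 block: all 9 pixels are compared with the mean."""
    pixels = np.asarray(block, dtype=float).ravel()  # row-major, centre is index 4
    bits = pixels > pixels.mean()
    # Bit i is set when pixel i is brighter than the mean of the kernel.
    return sum(2 ** i for i, b in enumerate(bits) if b)

block = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(ilbp_code(block))  # mean is 51/9, bits set: 0, 3, 4, 6, 7, 8 -> 473
```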

3.3 Multi Block Local Binary Pattern (MBLBP) 

Multi Block Local Binary Pattern is used to obtain a
texture pattern for every pixel by considering a local region of
size 3 x 3, 9 x 9, 15 x 15, etc. around the center pixel.
Computation of MBLBP for a 3 x 3 local region is equivalent to
the ordinary LBP. Local regions of other sizes are decomposed
into equally sized sub-regions. The average pixel intensity of
every sub-region is calculated and then thresholded against
the average value of the center sub-region. MBLBP values are
then computed in the same manner as in LBP, and they exhibit
more distinctive features.
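
The block averaging followed by LBP-style thresholding can be
sketched as below. The function name, the "cell" parameter (side
length of one sub-region) and the clockwise bit order are
assumptions made for this example.

```python
import numpy as np

# Positions of the 8 neighbouring sub-regions in the 3 x 3 grid of means,
# clockwise from the top-left (assumed bit order).
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def mb_lbp_code(region, cell):
    """MB-LBP label of a (3*cell) x (3*cell) region."""
    region = np.asarray(region, dtype=float)
    if region.shape != (3 * cell, 3 * cell):
        raise ValueError("region must be (3*cell) x (3*cell)")
    # Average intensity of each of the nine cell x cell sub-regions.
    means = region.reshape(3, cell, 3, cell).mean(axis=(1, 3))
    center = means[1, 1]
    # Threshold the neighbouring means against the centre mean, as in LBP.
    return sum(2 ** bit for bit, pos in enumerate(NEIGHBOURS) if means[pos] >= center)

block = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
# A 9 x 9 region whose nine 3 x 3 sub-regions have exactly the means in `block`.
region = np.kron(block, np.ones((3, 3)))

print(mb_lbp_code(block, 1), mb_lbp_code(region, 3))
```

With cell = 1 the sub-region means are the pixels themselves, so the
label coincides with the ordinary LBP label of the block.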

4. Proposed Methodology 

The texture features are extracted from the input
image. Texture feature extraction is an important process
for efficient retrieval. Though various models and
methods are available, they do not provide sufficient
accuracy in the retrieval process. The important steps involved
in the proposed technique are the identification and
localization of block-wise features of the image and the
extraction of geometrical image features with a local binary
pattern. The proposed CS-LBP technique is evaluated
experimentally. Thereby, a novel technique for image retrieval
using texture features is proposed.

4.1 Center Symmetric Local Binary Pattern 

CS-LBP has been used for the recognition of objects in
the PASCAL database. The original LBP histogram is very long,
and its features are not robust on flat image areas. In this
method, instead of comparing the gray level value of each
neighboring pixel with the center pixel, the center-symmetric
pairs of pixels are compared. CS-LBP is closely related to
the gradient operator: it considers the gray level differences
between pairs of opposite pixels in a neighborhood. CS-LBP
thus takes advantage of both LBP and gradient-based features.
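
A minimal Python sketch of the center-symmetric comparison for a
3 x 3 neighborhood follows. The pair order and the "threshold"
parameter (a small tolerance on the gray level difference, set to 0
here) are assumptions for illustration. Four pairs give 4 bits, so
the histogram has only 16 bins compared with 256 for the basic LBP.

```python
import numpy as np

# The four centre-symmetric pairs of a 3 x 3 neighbourhood (assumed order).
PAIRS = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 2), (1, 0))]

def cs_lbp_code(block, threshold=0.0):
    """CS-LBP label (0..15) of a 3 x 3 block."""
    code = 0
    for bit, (p, q) in enumerate(PAIRS):
        # Compare opposite pixels with each other, not with the centre.
        if float(block[p]) - float(block[q]) > threshold:
            code += 2 ** bit
    return code

def cs_lbp_histogram(gray):
    """Normalized 16-bin histogram of CS-LBP labels over interior pixels."""
    gray = np.asarray(gray)
    hist = np.zeros(16)
    for r in range(1, gray.shape[0] - 1):
        for c in range(1, gray.shape[1] - 1):
            hist[cs_lbp_code(gray[r - 1:r + 2, c - 1:c + 2])] += 1
    return hist / hist.sum()

block = np.array([[8, 5, 9],
                  [7, 6, 1],
                  [2, 3, 7]])
print(cs_lbp_code(block))  # pairs: 8>7, 5>3, 9>2, 1<7 -> 1 + 2 + 4 = 7
```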

4.2 Feature Extraction 

The features are located in order to compute the
feature sets for classification. Five feature sets are
calculated during feature extraction: feature set 1 is the
contrast of the image, feature set 2 the correlation features,
feature set 3 the energy features, feature set 4 the entropy
features and feature set 5 the homogeneity features of the
image.

Contrast Feature Set 

Contrast measures how the values of the matrix are
distributed and the number of local changes, reflecting the
image clarity and the shadow depth of the texture. A large
contrast represents deeper texture. The feature set is
generated with the contrast by the equation 3.9 for the image
block. The feature set of the input image under analysis is
represented as follows,

$$ \mathrm{Featureset}_{\mathrm{Contrast}} = \sum_{k} \sum_{m} (k - m)^2 \, V(k, m) \tag{4.1} $$

Correlation Feature Set 

The feature set is generated with the
correlation feature of the blocks of the input image under
analysis and is computed as follows,

$$ \mathrm{Featureset}_{\mathrm{Correlation}} = \frac{\sum_{k} \sum_{m} (k - \mu)(m - \mu) \, V(k, m)}{\sigma} \tag{4.2} $$

Energy Feature Set 

The feature set consists of a texture
feature based on the energy contributed by all image blocks.
The energy is computed by equation 3.8,

$$ \mathrm{Featureset}_{\mathrm{Energy}} = \sum_{k} \sum_{m} V(k, m)^2 \tag{4.3} $$

Entropy Feature Set 

The feature set is generated with the 
entropy as a measure for all the image blocks. Entropy 
measures the randomness in the image texture. A minimum 
entropy value indicates that the co-occurrence matrix values 
are uniform. Then, the maximum entropy implies that the 
gray distribution in the image is random. The feature set of 
the input image under analysis is represented as follows, 

$$ \mathrm{Featureset}_{\mathrm{Entropy}} = - \sum_{k} \sum_{m} V(k, m) \, \log V(k, m) \tag{4.4} $$

Homogeneity Feature Set

The feature set is generated with the
homogeneity measure for all the blocks of the input
image under analysis and is computed as follows,

$$ \mathrm{Featureset}_{\mathrm{Homogeneity}} = \sum_{k} \sum_{m} \frac{V(k, m)}{1 + |k - m|} \tag{4.5} $$

Where

V is the co-occurrence matrix,
(k, m) is the gray-level value at the coordinate,
$\mu = \sum_{k} \sum_{m} k \, V(k, m)$ is the weighted pixel average and
$\sigma$ is the weighted pixel variance.

Finally, the feature database is established to store
the feature sets of all the images available in the image
database (IDB). The final feature vector is formed from the
feature values derived by equations 4.1 to 4.5 and is
represented as below


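$$ \mathrm{Featureset}_{\mathrm{CSLBPF}} = \{ \mathrm{Featureset}_{\mathrm{Energy}}, \mathrm{Featureset}_{\mathrm{Contrast}}, \mathrm{Featureset}_{\mathrm{Entropy}}, \mathrm{Featureset}_{\mathrm{Correlation}}, \mathrm{Featureset}_{\mathrm{Homogeneity}} \} \tag{4.6} $$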
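
The five texture measures and the resulting feature vector can be
sketched in Python as follows. Several details are assumptions made
for this example and are not stated in the text: the co-occurrence
matrix counts horizontally adjacent pixels in both directions (so
that a single weighted average applies to rows and columns), the
weighted pixel variance is taken as the V-weighted mean of the
squared deviation from the weighted average, the entropy uses the
natural logarithm with a leading minus sign, and the correlation is
set to 0 when the variance is 0.

```python
import numpy as np

def cooccurrence(gray, levels):
    """Symmetric, normalized co-occurrence matrix V (horizontal neighbours)."""
    gray = np.asarray(gray)
    V = np.zeros((levels, levels), dtype=float)
    left, right = gray[:, :-1].ravel(), gray[:, 1:].ravel()
    np.add.at(V, (left, right), 1.0)
    np.add.at(V, (right, left), 1.0)  # count both directions -> symmetric V
    return V / V.sum()

def texture_features(V):
    """Energy, contrast, entropy, correlation and homogeneity of V."""
    k, m = np.indices(V.shape)
    mu = (k * V).sum()                # weighted pixel average
    var = ((k - mu) ** 2 * V).sum()   # weighted pixel variance (assumed form)
    nz = V[V > 0]                     # skip empty cells, log(0) is undefined
    return {
        "energy": (V ** 2).sum(),
        "contrast": ((k - m) ** 2 * V).sum(),
        "entropy": -(nz * np.log(nz)).sum(),
        "correlation": ((k - mu) * (m - mu) * V).sum() / var if var > 0 else 0.0,
        "homogeneity": (V / (1.0 + np.abs(k - m))).sum(),
    }

gray = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 2, 2, 2],
                 [2, 2, 3, 3]])
features = texture_features(cooccurrence(gray, levels=4))
# Feature vector in the order energy, contrast, entropy, correlation, homogeneity.
feature_vector = np.array(list(features.values()))
print(feature_vector)
```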


5. ALGORITHM 

The image retrieval process takes place in two
phases, defined as Algorithm I and Algorithm II.

Algorithm I

// Generating feature sets //

Input: Input image of size (M x N) from IDB.

Output: Feature database.

Begin

Step 1: Read an image of size (M x N) from the image database
(IDB).

Step 2: Partition the input image into k non-overlapping
blocks, each of size (n x n).

Step 3: Perform procedure_threshold ()

Step 4: Repeat Step 2 through Step 3 for all blocks of the input
image.

Step 5: Generate the feature set as mentioned in equation 4.6.

Step 6: Store the feature set in the feature database.

Step 7: Repeat Step 1 through Step 6 for all the images in IDB.

End

Algorithm II

// Retrieving the top m relevant images corresponding to the
target image //

Input: Target image (Ti) of size (M x N) and images from IDB.

Output: List of the top m relevant images corresponding to the
target image.

Begin

Step 1: Read the target image (Ti).

Step 2: Partition the target image into k non-overlapping
blocks of size (n x n).

Step 3: Perform procedure threshold_feature ()

Step 4: Repeat Step 2 through Step 3 for all blocks of the
target image.

Step 5: Generate the feature set as mentioned in equation 4.6.

Step 6: Perform procedure Euclidean_dist ( )

{

Compute the distance measures between the images
from IDB and the target image using
equation 4.7.

}

Step 7: Retrieve the top m relevant images from the image
database.

End
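
The retrieval phase of Algorithm II (Steps 6 and 7) can be sketched
in Python as below. The function names, the dictionary used as the
feature database and the example feature values are hypothetical and
serve only to illustrate ranking by Euclidean distance.

```python
import numpy as np

def euclidean_distance(x1, x2):
    """Euclidean distance between two feature vectors of equal length."""
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    return float(np.sqrt(((x1 - x2) ** 2).sum()))

def retrieve_top_m(target_features, feature_db, m):
    """Ids of the m stored images whose features are closest to the target."""
    distances = [(euclidean_distance(target_features, features), image_id)
                 for image_id, features in feature_db.items()]
    distances.sort()  # smallest distance first
    return [image_id for _, image_id in distances[:m]]

# Hypothetical feature database: one 5-dimensional vector per image.
feature_db = {
    "img_a": [0.50, 1.0, 0.7, -1.0, 0.5],
    "img_b": [0.40, 1.1, 0.7, -0.9, 0.5],
    "img_c": [0.10, 4.0, 1.9, 0.2, 0.3],
}
target = [0.45, 1.0, 0.7, -1.0, 0.5]
print(retrieve_top_m(target, feature_db, m=2))  # ['img_a', 'img_b']
```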

Procedure_threshold ( )

{

Step 1: Input M, N // size of input image

Step 2: Read the image with an even number of rows and columns.

Step 3: Convert the gray scale values into a matrix.

Step 4: Sort the array of values obtained in Step 3.

Step 5: Find the two middle gray scale values, one from the
lower range and one from the upper range.

Step 6: Compute the average of the two middle gray scale values
and take its whole-number part; this is the
threshold value.

Step 7: Convert the matrix into a binary matrix using the
threshold value.

Step 8: Repeat Step 3 to Step 7 for all images in the
database.

Step 9: Return

}
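
A minimal Python sketch of the block partitioning and of the
threshold procedure is given below. The function names are
hypothetical, and mapping pixels to 1 when they are greater than or
equal to the threshold is an assumption, since the procedure does
not state how ties are handled.

```python
import numpy as np

def split_blocks(image, n):
    """Partition a 2-D image into non-overlapping n x n blocks (row-major)."""
    image = np.asarray(image)
    rows, cols = image.shape
    return [image[r:r + n, c:c + n]
            for r in range(0, rows - n + 1, n)
            for c in range(0, cols - n + 1, n)]

def block_threshold(block):
    """Whole-number average of the two middle values of the sorted pixels."""
    values = np.sort(np.asarray(block).ravel())
    n = values.size                      # assumed even (even rows and columns)
    lower_mid = int(values[n // 2 - 1])  # middle value of the lower range
    upper_mid = int(values[n // 2])      # middle value of the upper range
    return (lower_mid + upper_mid) // 2

def binarize(block):
    """Binary matrix obtained with the block threshold, and the threshold."""
    t = block_threshold(block)
    return (np.asarray(block) >= t).astype(np.uint8), t

image = np.array([[12, 200, 34, 90],
                  [45, 67, 150, 23],
                  [89, 210, 5, 77],
                  [130, 60, 99, 41]])
binary, t = binarize(image)
blocks = split_blocks(image, 2)
print(t)       # middle values 67 and 77 -> 72
print(binary)
```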


6. Experiments and Results 

The proposed feature extraction is tested on images
collected from the standard database CORAL, which consists of
1000 images; sample images are shown in Fig. 6.1. The images
considered for this experiment, from which the feature sets
are generated, are of the size.



Fig.6.1 Sample Images 

Euclidean Distance 


To find the similarity between images, various
metrics are used to measure the distance between the
features of the images. One of the well known distance
metrics used for image retrieval is presented below. The
Euclidean distance is calculated as

$$ d_E(x_1, x_2) = \sqrt{\sum_{i=1}^{n} \left( x_1(i) - x_2(i) \right)^2} \tag{4.7} $$

Where x_1(i) is the i-th component of the feature vector of
the input image, x_2(i) is the i-th component of the feature
vector of the target image in the image database and n is the
dimension of the feature vectors.

In the texture-based image retrieval system, the
Euclidean distance is used to find the distance between the
feature vector of the target image and that of each image in
the image database. The difference between two images can
be expressed as the distance 'd' between the respective feature
vectors Fs(Ii) and Fs(It). For the given input image Ii and
the target image It, the Euclidean distance is calculated as,

$$ d(I_i, I_t) = \sqrt{\sum_{j=1}^{n} \left( Fs_j(I_i) - Fs_j(I_t) \right)^2} \tag{4.8} $$

Where Fs(Ii) is the feature set of the input image Ii,
Fs(It) is the n-dimensional feature vector of the target image
It and Fs_j denotes the j-th component of a feature vector.

The performance of a retrieval system can be 
measured in terms of its recall and precision. 

$$ \mathrm{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images}} \tag{4.9} $$

$$ \mathrm{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}} \tag{4.10} $$
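
As a small worked example of the two measures, the following Python
sketch computes precision and recall for one query. The image
identifiers and the function name are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against the relevant set."""
    retrieved, relevant = list(retrieved), set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 images retrieved, 5 relevant images in the database.
retrieved = ["bus_01", "bus_07", "flower_03", "bus_12"]
relevant = {"bus_01", "bus_07", "bus_12", "bus_20", "bus_31"}

p, r = precision_recall(retrieved, relevant)
print(f"precision = {100 * p:.2f}%, recall = {100 * r:.2f}%")
```

Three of the four retrieved images are relevant, so the precision is
3/4 = 75%; three of the five relevant images are retrieved, so the
recall is 3/5 = 60%.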


Category     LBP      ILBP     MBLBP    Proposed CSLBP
Buses        77.14    77.10    81.56    82.57
Dinosaurs    78.02    77.93    79.31    83.29
Elephants    50.74    62.96    68.55    68.94
Flowers      80.56    84.91    69.90    72.71


Table 6.1 Comparison Results in terms of Precision 

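Table 6.1 shows the precision of the proposed
technique and of the existing techniques. The proposed CSLBP
gives the highest precision for the Buses, Dinosaurs and
Elephants categories, while ILBP gives the highest precision
for Flowers. Hence, the proposed technique is also efficient
for image retrieval.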


Category     LBP      ILBP     MBLBP    Proposed CSLBP
Buses        71.20    73.17    75.87    80.27
Dinosaurs    80.34    82.28    83.38    85.59
Elephants    28.81    31.36    31.21    32.01
Flowers      64.35    69.13    70.56    71.08


Table 6.2 Comparison Results in terms of Recall 

Table 6.2 shows the recall of the proposed
technique and of the existing techniques. The proposed CSLBP
gives the highest recall in all four categories. Hence, the
proposed technique is also efficient for image retrieval.


Model              Retrieval Rate (%)
LBP                72.61
ILBP               76.97
MBLBP              75.33
Enhanced CSLBP     78.87


Table.6.3 Image Retrieval Rate 

Table 6.3 shows the retrieval rate for the query
images with each model. The experimental results show that
CSLBP produces the highest retrieval accuracy, 78.87%. The
performance was evaluated using Euclidean distance
classification, and the analysis shows that the proposed CSLBP
method is better for image retrieval.


7. Conclusion 

In this paper, enhanced centre symmetric local binary
pattern based image retrieval with block-wise texture features
has been proposed. The feature vectors of the images in IDB are
generated using the proposed technique and a feature database
is established. The Euclidean distance is computed to
measure the similarity between images, and the images are
retrieved based on this distance. The CSLBP method produces
better retrieval results, with 78.87% accuracy, than the
existing methods, namely Local Binary Pattern, Improved Local
Binary Pattern and Multi-block Local Binary Pattern. In the
experimental comparison with these existing models, the
proposed technique gives better results.